Overview

Dataset statistics

Number of variables: 40
Number of observations: 260601
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 79.5 MiB
Average record size in memory: 320.0 B
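
The two memory figures can be cross-checked against each other: dividing the total in-memory size by the number of observations should give roughly the average record size. The sketch below assumes 1 MiB = 1024² bytes and that each of the 40 variables takes 8 bytes per row. The 8-byte figure is an assumption, not something the report states.

```python
MIB = 1024 ** 2

n_observations = 260601
n_variables = 40
total_size_mib = 79.5  # "Total size in memory" from the report

# Average bytes per row implied by the total size.
avg_record_bytes = total_size_mib * MIB / n_observations
print(f"average record size: {avg_record_bytes:.1f} B")

# Assumption: every column stores one 8-byte value per row.
assumed_bytes_per_value = 8
assumed_record_bytes = n_variables * assumed_bytes_per_value
print(f"assumed record size: {assumed_record_bytes} B")
```

The small gap between the two numbers is consistent with the total size being rounded to one decimal place.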

Variable types

BOOL: 22
NUM: 9
CAT: 9

Warnings

building_id has unique values (Unique)
geo_level_1_id has 4011 (1.5%) zeros (Zeros)
age has 26041 (10.0%) zeros (Zeros)
count_families has 20862 (8.0%) zeros (Zeros)
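
The percentages in the Zeros warnings follow from the zero counts and the number of observations. A minimal sketch that recomputes them from the figures quoted above (the dictionary name is illustrative only):

```python
n_observations = 260601

# Zero counts as listed in the warnings.
zero_counts = {
    "geo_level_1_id": 4011,
    "age": 26041,
    "count_families": 20862,
}

# Share of zeros per variable, in percent, rounded to one decimal place.
zero_share = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_share.items():
    print(f"{name}: {pct}% zeros")
```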

Reproduction

Analysis started: 2020-09-29 02:59:50.968623
Analysis finished: 2020-09-29 03:00:56.706366
Duration: 1 minute and 5.74 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

building_id
Real number (ℝ≥0)

UNIQUE

Distinct: 260601
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525675.4828
Minimum: 4
Maximum: 1052934
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 4
5-th percentile: 52114
Q1: 261190
median: 525757
Q3: 789762
95-th percentile: 1000724
Maximum: 1052934
Range: 1052930
Interquartile range (IQR): 528572
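
Two of the quantile statistics are derived from the others: the range is the maximum minus the minimum, and the interquartile range is Q3 minus Q1. A short check using the building_id figures above:

```python
# Quantile statistics reported for building_id.
minimum, q1, median, q3, maximum = 4, 261190, 525757, 789762, 1052934

value_range = maximum - minimum  # "Range"
iqr = q3 - q1                    # "Interquartile range (IQR)"

print(value_range, iqr)
```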

Descriptive statistics

Standard deviation: 304544.999
Coefficient of variation (CV): 0.5793403136
Kurtosis: -1.203878964
Mean: 525675.4828
Median Absolute Deviation (MAD): 264277
Skewness: 0.001882356737
Sum: 1.369915565e+11
Variance: 9.274765644e+10
Monotonicity: Not monotonic
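
Several descriptive statistics are related to each other. The coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks both against the building_id figures. It also includes a small helper, `is_monotonic`, which is a hypothetical illustration of what "Not monotonic" means for values taken in row order, not the profiler's own implementation.

```python
mean = 525675.4828
std = 304544.999

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation

print(f"CV = {cv:.6f}")
print(f"Variance = {variance:.6e}")


def is_monotonic(values):
    """Return True if values never decrease, or never increase, in row order."""
    pairs = list(zip(values, values[1:]))
    non_decreasing = all(a <= b for a, b in pairs)
    non_increasing = all(a >= b for a, b in pairs)
    return non_decreasing or non_increasing


# Made-up values for illustration only.
print(is_monotonic([3, 1, 2]))
print(is_monotonic([1, 2, 2, 5]))
```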
Histogram with fixed size bins (bins=50)
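
A histogram with fixed size bins splits the interval from the minimum to the maximum into equally wide bins and counts the values in each. The function below is a minimal sketch of that idea. Its name is hypothetical, and putting the maximum value into the last bin is an assumption about edge handling, not a statement about how the report's plots were built.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals spanning min..max."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        # All values are identical: place everything in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum value falls into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


# Made-up values for illustration only.
example_counts = fixed_width_histogram(list(range(10)), bins=5)
print(example_counts)
```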

Common values

Value                    Count   Frequency (%)
1052670                  1       < 0.1%
847304                   1       < 0.1%
368102                   1       < 0.1%
729986                   1       < 0.1%
900578                   1       < 0.1%
896480                   1       < 0.1%
808415                   1       < 0.1%
812505                   1       < 0.1%
290264                   1       < 0.1%
269782                   1       < 0.1%
Other values (260591)    260591  > 99.9%

Minimum 5 values

Value    Count  Frequency (%)
4        1      < 0.1%
8        1      < 0.1%
12       1      < 0.1%
16       1      < 0.1%
17       1      < 0.1%

Maximum 5 values

Value    Count  Frequency (%)
1052934  1      < 0.1%
1052931  1      < 0.1%
1052929  1      < 0.1%
1052926  1      < 0.1%
1052921  1      < 0.1%

geo_level_1_id
Real number (ℝ≥0)

ZEROS

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.90035341
Minimum: 0
Maximum: 30
Zeros: 4011
Zeros (%): 1.5%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
median: 12
Q3: 21
95-th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.033616625
Coefficient of variation (CV): 0.5779433361
Kurtosis: -1.213248785
Mean: 13.90035341
Median Absolute Deviation (MAD): 6
Skewness: 0.2725303548
Sum: 3622446
Variance: 64.53899608
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Common values

Value                Count  Frequency (%)
6                    24381  9.4%
26                   22615  8.7%
10                   22079  8.5%
17                   21813  8.4%
8                    19080  7.3%
7                    18994  7.3%
20                   17216  6.6%
21                   14889  5.7%
4                    14568  5.6%
27                   12532  4.8%
Other values (21)    72434  27.8%

Minimum 5 values

Value  Count  Frequency (%)
0      4011   1.5%
1      2701   1.0%
2      931    0.4%
3      7540   2.9%
4      14568  5.6%

Maximum 5 values

Value  Count  Frequency (%)
30     2686   1.0%
29     396    0.2%
28     265    0.1%
27     12532  4.8%
26     22615  8.7%

geo_level_2_id
Real number (ℝ≥0)

Distinct: 1414
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 701.0746851
Minimum: 0
Maximum: 1427
Zeros: 38
Zeros (%): < 0.1%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 69
Q1: 350
median: 702
Q3: 1050
95-th percentile: 1377
Maximum: 1427
Range: 1427
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 412.7107336
Coefficient of variation (CV): 0.5886829782
Kurtosis: -1.188232475
Mean: 701.0746851
Median Absolute Deviation (MAD): 349
Skewness: 0.02895738139
Sum: 182700764
Variance: 170330.1496
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                  Count   Frequency (%)
39                     4038    1.5%
158                    2520    1.0%
181                    2080    0.8%
1387                   2040    0.8%
157                    1897    0.7%
363                    1760    0.7%
463                    1740    0.7%
673                    1704    0.7%
533                    1684    0.6%
883                    1626    0.6%
Other values (1404)    239512  91.9%

Minimum 5 values

Value  Count  Frequency (%)
0      38     < 0.1%
1      204    0.1%
3      77     < 0.1%
4      315    0.1%
5      25     < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
1427   6      < 0.1%
1426   286    0.1%
1425   466    0.2%
1424   7      < 0.1%
1423   3      < 0.1%

geo_level_3_id
Real number (ℝ≥0)

Distinct: 11595
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6257.876148
Minimum: 0
Maximum: 12567
Zeros: 2
Zeros (%): < 0.1%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 611
Q1: 3073
median: 6270
Q3: 9412
95-th percentile: 11927
Maximum: 12567
Range: 12567
Interquartile range (IQR): 6339

Descriptive statistics

Standard deviation: 3646.369645
Coefficient of variation (CV): 0.5826848532
Kurtosis: -1.213896506
Mean: 6257.876148
Median Absolute Deviation (MAD): 3171
Skewness: 0.0003935120899
Sum: 1630808782
Variance: 13296011.59
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                   Count   Frequency (%)
633                     651     0.2%
9133                    647     0.2%
621                     530     0.2%
11246                   470     0.2%
2005                    466     0.2%
11440                   455     0.2%
7723                    443     0.2%
9229                    381     0.1%
2452                    349     0.1%
12258                   312     0.1%
Other values (11585)    255897  98.2%

Minimum 5 values

Value  Count  Frequency (%)
0      2      < 0.1%
1      6      < 0.1%
3      9      < 0.1%
5      14     < 0.1%
6      21     < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
12567  1      < 0.1%
12565  7      < 0.1%
12564  6      < 0.1%
12563  24     < 0.1%
12562  3      < 0.1%

count_floors_pre_eq
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.129723217
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7276645453
Coefficient of variation (CV): 0.3416709456
Kurtosis: 2.322597881
Mean: 2.129723217
Median Absolute Deviation (MAD): 0
Skewness: 0.8341129586
Sum: 555008
Variance: 0.5294956905
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Common values

Value  Count   Frequency (%)
2      156623  60.1%
3      55617   21.3%
1      40441   15.5%
4      5424    2.1%
5      2246    0.9%
6      209     0.1%
7      39      < 0.1%
9      1       < 0.1%
8      1       < 0.1%

Minimum 5 values

Value  Count   Frequency (%)
1      40441   15.5%
2      156623  60.1%
3      55617   21.3%
4      5424    2.1%
5      2246    0.9%

Maximum 5 values

Value  Count  Frequency (%)
9      1      < 0.1%
8      1      < 0.1%
7      39     < 0.1%
6      209    0.1%
5      2246   0.9%

age
Real number (ℝ≥0)

ZEROS

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.53502865
Minimum: 0
Maximum: 995
Zeros: 26041
Zeros (%): 10.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
median: 15
Q3: 30
95-th percentile: 60
Maximum: 995
Range: 995
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 73.56593652
Coefficient of variation (CV): 2.772408408
Kurtosis: 157.2482363
Mean: 26.53502865
Median Absolute Deviation (MAD): 10
Skewness: 12.19249422
Sum: 6915055
Variance: 5411.947016
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=42)

Common values

Value                Count  Frequency (%)
10                   38896  14.9%
15                   36010  13.8%
5                    33697  12.9%
20                   32182  12.3%
0                    26041  10.0%
25                   24366  9.3%
30                   18028  6.9%
35                   10710  4.1%
40                   10559  4.1%
50                   7257   2.8%
Other values (32)    22855  8.8%

Minimum 5 values

Value  Count  Frequency (%)
0      26041  10.0%
5      33697  12.9%
10     38896  14.9%
15     36010  13.8%
20     32182  12.3%

Maximum 5 values

Value  Count  Frequency (%)
995    1390   0.5%
200    106    < 0.1%
195    2      < 0.1%
190    3      < 0.1%
185    1      < 0.1%

area_percentage
Real number (ℝ≥0)

Distinct: 84
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.018050583
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
median: 7
Q3: 9
95-th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.392230936
Coefficient of variation (CV): 0.5477928694
Kurtosis: 30.43825794
Mean: 8.018050583
Median Absolute Deviation (MAD): 2
Skewness: 3.526082314
Sum: 2089512
Variance: 19.29169259
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
6                    42013  16.1%
7                    36752  14.1%
5                    32724  12.6%
8                    28445  10.9%
9                    22199  8.5%
4                    19236  7.4%
10                   15613  6.0%
11                   13907  5.3%
3                    11837  4.5%
12                   7581   2.9%
Other values (74)    30294  11.6%

Minimum 5 values

Value  Count  Frequency (%)
1      90     < 0.1%
2      3181   1.2%
3      11837  4.5%
4      19236  7.4%
5      32724  12.6%

Maximum 5 values

Value  Count  Frequency (%)
100    1      < 0.1%
96     3      < 0.1%
90     1      < 0.1%
86     5      < 0.1%
85     4      < 0.1%

height_percentage
Real number (ℝ≥0)

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.434365179
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 4
median: 5
Q3: 6
95-th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.918418221
Coefficient of variation (CV): 0.3530160667
Kurtosis: 14.31852616
Mean: 5.434365179
Median Absolute Deviation (MAD): 1
Skewness: 1.808261757
Sum: 1416201
Variance: 3.68032847
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=27)

Common values

Value                Count  Frequency (%)
5                    78513  30.1%
6                    46477  17.8%
4                    37763  14.5%
7                    35465  13.6%
3                    25957  10.0%
8                    13902  5.3%
2                    9305   3.6%
9                    5376   2.1%
10                   4492   1.7%
11                   917    0.4%
Other values (17)    2434   0.9%

Minimum 5 values

Value  Count  Frequency (%)
2      9305   3.6%
3      25957  10.0%
4      37763  14.5%
5      78513  30.1%
6      46477  17.8%

Maximum 5 values

Value  Count  Frequency (%)
32     75     < 0.1%
31     1      < 0.1%
28     2      < 0.1%
26     2      < 0.1%
25     3      < 0.1%

land_surface_condition
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

foundation_type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

roof_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

ground_floor_type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

other_floor_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

position
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

plan_configuration
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

has_superstructure_adobe_mud
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

has_superstructure_mud_mortar_stone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

has_superstructure_stone_flag
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

has_superstructure_cement_mortar_stone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%

has_superstructure_mud_mortar_brick
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      242840  93.2%
1      17761   6.8%

has_superstructure_cement_mortar_brick
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      240986  92.5%
1      19615   7.5%

has_superstructure_timber
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      194151  74.5%
1      66450   25.5%

has_superstructure_bamboo
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      238447  91.5%
1      22154   8.5%

has_superstructure_rc_non_engineered
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      249502  95.7%
1      11099   4.3%

has_superstructure_rc_engineered
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      256468  98.4%
1      4133    1.6%

has_superstructure_other
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      256696  98.5%
1      3905    1.5%

legal_ownership_status
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
v      250939  96.3%
a      5512    2.1%
w      2677    1.0%
r      1473    0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

count_families
Real number (ℝ≥0)

ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9839486418
Minimum: 0
Maximum: 9
Zeros: 20862
Zeros (%): 8.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4183889779
Coefficient of variation (CV): 0.425214244
Kurtosis: 17.67094319
Mean: 0.9839486418
Median Absolute Deviation (MAD): 0
Skewness: 1.634757873
Sum: 256418
Variance: 0.1750493368
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Common values

Value  Count   Frequency (%)
1      226115  86.8%
0      20862   8.0%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%
5      104     < 0.1%
6      22      < 0.1%
7      7       < 0.1%
9      4       < 0.1%
8      2       < 0.1%

Minimum 5 values

Value  Count   Frequency (%)
0      20862   8.0%
1      226115  86.8%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%

Maximum 5 values

Value  Count  Frequency (%)
9      4      < 0.1%
8      2      < 0.1%
7      7      < 0.1%
6      22     < 0.1%
5      104    < 0.1%

has_secondary_use
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      231445  88.8%
1      29156   11.2%

has_secondary_use_agriculture
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      243824  93.6%
1      16777   6.4%

has_secondary_use_hotel
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      251838  96.6%
1      8763    3.4%

has_secondary_use_rental
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      258490  99.2%
1      2111    0.8%

has_secondary_use_institution
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260356  99.9%
1      245     0.1%

has_secondary_use_school
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260507  > 99.9%
1      94      < 0.1%

has_secondary_use_industry
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260322  99.9%
1      279     0.1%

has_secondary_use_health_post
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260552  > 99.9%
1      49      < 0.1%

has_secondary_use_gov_office
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260563  > 99.9%
1      38      < 0.1%

has_secondary_use_use_police
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260578  > 99.9%
1      23      < 0.1%

has_secondary_use_other
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      259267  99.5%
1      1334    0.5%

damage_grade
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
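
The definition above can be sketched in a few lines of plain Python. The function name `pearson_r` and the sample data are illustrative assumptions. The normalising factors of the covariance and the standard deviations cancel, so the sketch works with raw sums of deviations.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of (co)deviations; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r_original = pearson_r(x, y)
# Shifting and (positively) rescaling y leaves r unchanged.
r_rescaled = pearson_r(x, [3.0 * v + 10.0 for v in y])
print(r_original, r_rescaled)
```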

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
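
As a rough illustration, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. The helper names are hypothetical, and giving tied values their average rank is a common convention assumed here, not something the report specifies.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship (made-up data).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

For this example the rank correlation is perfect, while Pearson's r stays below 1 because the relationship is not linear.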

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
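
The pair-counting description corresponds to the variant often called tau-a. A minimal sketch follows, with a hypothetical function name. Statistical libraries frequently compute a tie-adjusted variant (tau-b) instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Made-up data: two concordant pairs and one discordant pair.
tau_example = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau_example)
```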

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013, which can be found here.
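
For orientation, a bias-corrected Cramér's V in the style of Bergsma's correction can be sketched as below. The function name and the contingency tables are illustrative assumptions, and the sketch follows the commonly cited form of the correction rather than the profiler's source code.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic from observed and expected counts.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Made-up tables: independent variables, then perfectly associated ones.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```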

Missing values


Sample

First rows

building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
080290664871219823065trnfqtd11000000000v1000000000003
1288308900281221087ornxqsd01000000000v1000000000002
29494721363897321055trnfxtd01000000000v1000000000003
3590882224181069421065trnfxsd01000011000v1000000000002
420194411131148833089trnfxsd10000000000v1000000000003
53330208558608921095trnfqsd01000000000v1110000000002
672845194751206622534nrnxqsd01000000000v1000000000003
747551520323122362086twqvxsu00000110000v1000000000001
84411260757721921586trqfqsd01000010000v1000000000002
99895002688699410134tinvjsd00000100000v1000000000001

Last rows

building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
26059156080520368598012553nrnfjsd01000000000v1110000000003
260592207683101382190322555trnfqsd01000010000v1000000000002
2605932264218767861325135trnfqsd01000000000v1110000000002
260594159555271811537601312trnfxjd00001000000v1000000000002
2605958270128268471822085trnfqsd01000000000v1000000000003
260596688636251335162115563nrnfjsq01000000000v1000000000002
2605976694851771520602065trnfqsd01000000000v1000000000003
2605986025121751816335567trqfqsd01000000000v1000000000003
26059915140926391851210146trxvsjd00000100000v1000000000002
260600747594219910131076nrnfqjd01000000000v3000000000003